Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Lucene's new SimpleQueryParser #4159

Closed
dakrone opened this issue Nov 12, 2013 · 12 comments
Closed

Add support for Lucene's new SimpleQueryParser #4159

dakrone opened this issue Nov 12, 2013 · 12 comments

Comments

@dakrone
Copy link
Member

dakrone commented Nov 12, 2013

Lucene's new SimpleQueryParser is designed to be able to parse human-entered queries without throwing any exceptions, we should add this and expose it in Elasticsearch.

See https://issues.apache.org/jira/browse/LUCENE-5336

@ghost ghost assigned dakrone Nov 12, 2013
@uboness
Copy link
Contributor

uboness commented Nov 12, 2013

nice one! should probably be named simple_query_string

@kimchy
Copy link
Member

kimchy commented Nov 13, 2013

I think this potentially will end up being the more common "free" text option, so simple_query_string as a name is not good. We can either have a new (better) name, have an option on match query to use it, or have an option on query_string to use it.

@clintongormley thoughts?

@clintongormley
Copy link

I think the match query already does a lot of stuff and we shouldn't overload it with parsing query strings as well.

I'd like to to see a new query dedicated to parsing query languages, not just this simple one. It should support different "dialects", and each dialect may have different flags and options that can be configured. Perhaps the parsed query? Or human? search_box?

/me is short on inspiration here

The query_string query already has a lot of options and a whole lot of history, so i would prefer to avoid it too. That said, a number of those parameters are things which we would need to add to this new multi-dialect query.

@nik9000
Copy link
Member

nik9000 commented Nov 13, 2013

I could see a parsed or a smart query type with with a dialect option. I can imagine the documentation page for it listing all the options and all the dialects to which they apply. It would help to make sure that all the options that do the same thing have the same name.

@s1monw
Copy link
Contributor

s1monw commented Nov 13, 2013

oh yeah smart comes right after fast :)

@nik9000
Copy link
Member

nik9000 commented Nov 13, 2013

Can the dialect for query_string be cranky?

In all seriousness, smart is a very marketing-y name and I withdraw my pseudo-recommendation for it. I don't like parsed though. interpreted has baggage and isn't much better than parsed from a "what does this thing do?"

I mean, what word say "explodes user provided text into potentially complex query?" exploded?

@clintongormley
Copy link

dwim? :)

@dakrone
Copy link
Member Author

dakrone commented Nov 14, 2013

I have this implemented (without tests yet) as the simple query type here: https://github.com/dakrone/elasticsearch/compare/add-simple-query-parser

Here's what this looks like so far:

In simplest form:

{
  "query": {
    "simple": {
      "description": "foo bar"
    }
  }
}

And with all the options:

{
  "query": {
    "simple": {
      "query": "foo bar",
      "fields": ["body^3", "name"],
      "default_operator": "and",
      "analyzer": "myanalyzer"
    }
  }
}

How does this look? I'd like to nail down a name/syntax before writing the tests/documentation/javadocs and submitting a pull request.

@dakrone
Copy link
Member Author

dakrone commented Nov 14, 2013

Here's my thoughts on why it should be a separate query:

  • It doesn't fit in query_string for me since you can't prefix terms with a field like name:foo, and doesn't support the same query syntax (which could be really confusing).
  • It doesn't fit in match for me since it includes "special" characters, ie word + (thing | "other stuff") -bad

@clintongormley
Copy link

Hi @dakrone

I just want to make sure that, whatever name we choose, we don't end up blocking the possibility of adding other dialects later on. Some of those dialects MAY understand the field: prefix, or handle wildcards, fuzzy etc, so I don't want the form of this simple query parser to be the only thing we consider when choosing a name.

I do agree that this simple syntax probably does make sense as the default dialect in whatever this query is called.

Regarding the short form of the query:

"query": {
    "name_to_be_chosen": {
        "description": "foo bar"
    }
}

The simple dialect doesn't have the field: prefix, so this form looks fine. However it is likely that other dialects will have field: . We're just about to deprecate the field query in favour of just the query_string query for exactly this reason. So I'm wondering whether we should support the short form or not.

Thoughts?

@dakrone
Copy link
Member Author

dakrone commented Nov 14, 2013

So I'm wondering whether we should support the short form or not.
Thoughts?

I'd be in favor of that, the only reason I actually added it was to "simplify" this query to try and get it down to something that looks like a term/match query. I understand the concern for future dialects, so I can go either way.

@clintongormley
Copy link

So then all that's left is choosing the name :)

syntax ? parsed? dsl? bazinga?

dakrone added a commit to dakrone/elasticsearch that referenced this issue Dec 10, 2013
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.

Resolves elastic#4159
dakrone added a commit that referenced this issue Dec 12, 2013
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.

Resolves #4159
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
This adds support for Lucene's SimpleQueryParser by adding a new type
of query called the `simple_query_string`. The `simple_query_string`
query is designed to be able to parse human-entered queries without
throwing any exceptions.

Resolves elastic#4159
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants